Results and concepts in the theory of weak convergence of a
sequence of probability measures are applied to convergence
problems for a variety of recursive adaptive (stochastic
approximation like) methods. Similar techniques have had wide
applicability in areas of operations research and in some other
areas in stochastic control. It is quite likely that they will
play a much more important role in control theory than they
do at present, since they allow relatively simple and natural
proofs for many types of convergence and approximation problems.
Part of the aim of the paper is tutorial: to introduce the ideas,
and to show how they might be applied. Also, many of the results
are new, and they can all be generalized in many directions.

1. Introduction


The aims of this paper are two-fold. The first aim is
tutorial. The techniques of and the results in the theory of
weak convergence of a sequence of probability measures have
found many useful applications in many areas of operations
research and statistics [1], [2]. Their role in control theory
has been relatively limited, being confined mainly to the work
in [3], [4], which deal with control problems on diffusion models.
Yet the intrinsic power of the theory, as well as the nature of the
past successes, suggests that its role in control theory should be
deeper than it is at present. The techniques are particularly
valuable when convergence or approximation ideas are being
dealt with.

In order to illustrate the possibilities, the ideas of
weak convergence theory will be applied (the second goal of
the paper) to some convergence problems for an interesting
class of adaptive processes. These processes have the interesting
stochastic approximation (SA) like framework used by Ljung
and others [5], [6], and a number of practical applications.
The application to the convergence problem will illustrate some
of the main ideas of weak convergence theory. Some of Ljung's
results will be rederived.

Sometimes our conditions are weaker, and sometimes stronger.
Our proofs are generally much simpler. They appear to be
readily generalizable to more abstract cases, and conditions on
the noise and coefficient sequences are weaker. The ideas used
here allow simpler proofs, and focus on somewhat different
types of conditions. They are essentially "invariant" with
respect to "perturbations". Other advantages will be discussed
in the sequel. For example, there are extensions to the case
where state space constraints must be included. The results
here do not replace those of Ljung. Both methods (which are
not unrelated) are quite interesting, and various combinations
of them may well prove more fruitful than either one alone, in
allowing us to handle broader classes of SA like procedures -
which include more realistic noise processes, etc.

The classical techniques are too cumbersome and too dependent
on special properties, such as square summability of
certain coefficient sequences, and orthogonality properties of
the noise. "On line" methods of identification, for example,
usually require rather weak assumptions on the noise sequences.
In any case, more powerful methods for the handling of such
recursive algorithms have long been needed.

In Section 2, some of the ideas of weak convergence theory
are introduced. Section 3 elaborates certain points and criteria
of Section 2, Section 4 develops the main application,
and certain extensions are discussed in Section 5.

To motivate our point of view, suppose that
$\{X^n(t),\ T_1 \le t \le T_2\}$ (possibly $T_2 = \infty$ and/or
$T_1 = -\infty$) is a sequence of random processes whose paths are in
a path or function space $\mathcal{X}$, w.p.1, for each $n$. It turns
out that if we view each $X^n(\cdot)$ as an abstract valued random
variable (with values in $\mathcal{X}$), and study the sequence of
measures induced on $\mathcal{X}$ by $\{X^n(\cdot)\}$, then very useful
results can often be obtained on various limiting ($n \to \infty$)
properties of the sequence. For this reason, it is useful to study
sequences of probabilities on suitable abstract spaces, even if the
applications are concerned.

2. Weak Convergence of Measures


The main reference is Billingsley [7]. See also Gikhman and
Skorokhod [8], Chapter 9, or Chapter 2 in Kushner [4], for a
brief summary of the basic ideas. Weak convergence is a generalization
to abstract valued random variables of convergence in
distribution. The statements below (unless otherwise specified)
are in [7], Chapter 1. Let $\mathcal{X}$ denote a complete separable
metric space. Suppose for the moment that the processes† are of
interest over a finite interval $[T_1,T_2]$. Then $\mathcal{X}$ is
usually taken to be $C[T_1,T_2]$ or $D[T_1,T_2]$ (or $C^m$, $D^m$,
their $m$-fold products), where $C$ is the space of real-valued
continuous functions with the sup norm, and $D$ is the space of
real-valued functions which are right continuous, continuous at
$T_2$, and have left hand limits on $(T_1,T_2]$. The space $D$ is
often much more convenient to work with than the space $C$, but,
for simplicity only, $C$ will be used here. (See [7], or [4],
Chapter 2, for a discussion of the topology which is usually used
on $D$.) If $T_1$ or $T_2$ is infinite, then the usual extension of
the topology on $C$ is used (convergence is then equivalent to
uniform convergence on finite intervals).

Let $\{X_n\}$ denote a sequence of random variables with values
in $\mathcal{X}$, let $\{P_n\}$ denote the corresponding induced
measures on the (Borel) sets of $\mathcal{X}$, and let
$C(\mathcal{X})$ (resp., $C_P(\mathcal{X})$, where $P$ is a measure on
$\mathcal{X}$) denote the set of real-valued continuous, bounded
functions on $\mathcal{X}$ (resp., real-valued, bounded, measurable
and continuous almost everywhere on $\mathcal{X}$, with respect to
$P$). The sequence $\{P_n\}$ is said to converge weakly to $P$
(written $P_n \Rightarrow P$) if

(2.1)    $\int f(y)\,P_n(dy) \to \int f(y)\,P(dy)$

for all $f(\cdot) \in C(\mathcal{X})$. If (2.1) holds for all such
$f(\cdot)$, it also holds for all $f(\cdot) \in C_P(\mathcal{X})$.
This is an important generalization, since many of the $f(\cdot)$ of
common interest in control theory are not continuous everywhere†
(see examples in [4]). Clearly, if $\mathcal{X}$ is a Euclidean space,
then weak convergence is equivalent to convergence in distribution.
Let $\mathcal{X} = C^m[T_1,T_2]$, and let $P_n$ be a measure on
$\mathcal{X}$ induced by a process $X^n(\cdot)$ or random variable
$X_n$. If $P$ is a measure on $\mathcal{X}$, there is a separable
$R^m$ valued process $X(\cdot)$ on $[T_1,T_2]$ with continuous paths
w.p.1, which induces $P$ on $\mathcal{X}$. If $P_n \Rightarrow P$, we
abuse terminology and say that $X^n(\cdot) \to X(\cdot)$ (or
$X_n \to X$) weakly, or in distribution.

† The processes of concern are to be real or vector valued.
$\mathcal{X}$ is the space in which the paths lie - not where the
values lie. A process $X(\cdot)$ is considered to be an $\mathcal{X}$
valued random variable, where convenient.

The sequence $\{P_n\}$ (or $\{X_n\}$) is said to be tight if, for each
$\epsilon > 0$, there is a compact set $K_\epsilon \subset \mathcal{X}$
such that

(2.2)    $P_n\{K_\epsilon\} = P\{X^n(\cdot) \in K_\epsilon\} > 1 - \epsilon$, all $n$.

If $\{P_n\}$ is tight, then for each subsequence, there is a further
subsequence (denoted by $\{P_{n'}\}$) and a measure $P$ such that
$P_{n'} \Rightarrow P$. Indeed, tightness is necessary and sufficient
for $\{P_n\}$ to have this property. A sequence of measures on a
product space (corresponding to, say, a sequence of pairs) is tight
if each component $\{P_n^i\}$ is tight.

† E.g., $f(\cdot)$ that relate to exit times of a process from a set.

Following the aforementioned abuse of terminology, if $\{P_n\}$
is tight, we may say that $\{X_n\}$ is tight, and then that there is a
weakly convergent subsequence of $\{X_n\}$ with limit $X$ (where,
if $X^n(\cdot)$ is a process with paths in a $C$ space, $X(\cdot) = X$
will be also).

In practice, the $\{X^n(\cdot)\}$ can arise in many ways. It may
be a sequence of approximations to a process or optimal process
$X(\cdot)$, which is obtained by (say) some computational procedure,
and it may be desired to show that $X^n(\cdot)$ converges to
$X(\cdot)$ in some sense. In many examples, a problem or process may
be parametrized by a scaling factor (as is often the case in
applications to queueing [1]) or other parameter $\alpha$. A limit
process ($\alpha = 0$ or $\alpha = \infty$) may be easy to study, and
it may be desired to show that $X^\alpha \to X^0$ (or $X^\infty$) in a
suitable sense. In this paper, the processes $X^n(\cdot)$ arise in a
somewhat different way. See Section 4 on.

sequence to a functional of the limit - in distribution or
expectation. One of the main advantages of the technique is that,
once we know there is tightness, we know that we can extract and
treat convergent subsequences. It is not necessary to prove that they
exist, as will be seen below. This is a great advantage.

One of the most useful tools in applications of weak convergence
theory is known as Skorokhod imbedding (see Skorokhod [9],
Theorem 3.1.1 or [4], Theorem 2.2). The theorem is the following.

Let $P_n$ and $P$ be induced (on $\mathcal{X}$) by the $\mathcal{X}$
valued random variables $X_n$ and $X$, resp., and let
$P_n \Rightarrow P$. Then there is some probability space
$(\tilde\Omega, \tilde{\mathcal{B}}, \tilde P)$ with $\mathcal{X}$
valued random variables $\{\tilde X_n\}$ and $\tilde X$ defined on it
such that, for each Borel set $A$ in $\mathcal{X}$,

(2.3)    $\tilde P\{\tilde X_n \in A\} = P\{X_n \in A\}$,
         $\tilde P\{\tilde X \in A\} = P\{X \in A\}$,

and $\tilde X_n \to \tilde X$ w.p.1 in the topology of $\mathcal{X}$.

Let $\mathcal{X} = C^m(-\infty,\infty)$, and let $X_n$ and $X$ be
$\mathcal{X}$ valued random variables, w.p.1. Thus, the corresponding
processes $X^n(\cdot)$, $X(\cdot)$ are defined on $(-\infty,\infty)$,
are $R^m$ valued, and have continuous paths w.p.1. Suppose that
$P_n \Rightarrow P$. Let $\tilde X_n$ and $\tilde X$ (corresponding to
$R^m$ valued continuous processes $\tilde X^n(\cdot)$,
$\tilde X(\cdot)$) on $(\tilde\Omega, \tilde{\mathcal{B}}, \tilde P)$
correspond to $X_n$ and $X$, by the Skorokhod imbedding. Then
$\tilde X_n \to \tilde X$ w.p.1 on $\mathcal{X}$, and this implies that
$\sup_{|t| \le T} |\tilde X^n(t) - \tilde X(t)| \to 0$ w.p.1, as
$n \to \infty$, for each $T < \infty$. While the distributions of
$\tilde X_n$ (resp., of $\tilde X$) are exactly those of $X_n$ (resp.,
$X$), the probability spaces are different. Usually, we are concerned
mainly with characterizing the limits $P$, $X$, and, since $\tilde X$
is equivalent to $X$ in the sense that it induces the same
distributions on $C^m(-\infty,\infty)$ (or on whatever $\mathcal{X}$
is), the properties of $\tilde X$ often yield the desired properties
of $X$. The imbedding allows us to use w.p.1 convergence in certain
places. When the imbedding (and consequent change of probability
space) is used here, it will be so stated - but the tilde
notation will not be used. The same symbols will be used for both
the original and the tilde process.

Consider a simple example. Let $X_n$, $X$ be real-valued with
$P\{X_n = 1\} = 1 - P\{X_n = 0\} = 1/n$, $P\{X = 0\} = 1$. Define
$\tilde\Omega = [0,1]$, $\tilde{\mathcal{B}}$ = Borel sets on $[0,1]$,
and $\tilde P$ = Lebesgue measure on $[0,1]$. Define $\tilde X_n$ and
$\tilde X$ by: $\tilde X_n = 1$ on $[0, 1/n]$ and zero elsewhere,
$\tilde X = 0$ on $[0,1]$. Then (2.3) holds, and
$\tilde X_n \to \tilde X$ w.p.1. The joint distributions of
$X_1, \ldots, X_n, \ldots, X$ and of
$\tilde X_1, \ldots, \tilde X_n, \ldots, \tilde X$ are not necessarily
the same; but, if we are only concerned with the probabilistic
properties of the limit $X$, we can just as well use the imbedded
process. In fact, in many applications each $X_n$ is defined on a
different probability space anyway (but not in this paper), in which
case the "joint distribution" of $\{X_1, \ldots, X_n, \ldots, X\}$
has no meaning anyway.
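The imbedded variables in this example can be written down explicitly. The following is a minimal sketch, assuming the success probability is $1/n$ and taking the sample point $\omega$ as an exact fraction in $[0,1]$; the function names are illustrative and not part of the paper. It shows that, for a fixed $\omega > 0$, the imbedded sequence equals 1 only for finitely many $n$, which is the w.p.1 convergence to zero.

```python
from fractions import Fraction


def x_tilde(n, omega):
    """Imbedded variable on [0, 1]: 1 on [0, 1/n], 0 elsewhere."""
    return 1 if omega <= Fraction(1, n) else 0


def prob_one(n):
    """Lebesgue measure of the set {omega : x_tilde(n, omega) = 1}."""
    return Fraction(1, n)


def last_index_equal_one(omega, n_max):
    """Largest n <= n_max with x_tilde(n, omega) = 1 (0 if there is none)."""
    hits = [n for n in range(1, n_max + 1) if x_tilde(n, omega) == 1]
    return max(hits, default=0)


if __name__ == "__main__":
    omega = Fraction(3, 10)
    # Beyond this index the imbedded sequence is identically zero at omega.
    print(last_index_equal_one(omega, 100))
```

Only the single point $\omega = 0$ fails to converge, and it has Lebesgue measure zero.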


3. Criteria for Tightness When $\mathcal{X} = C^r(-\infty,\infty)$

Let us specialize to the case $C^r[-T,T]$ (see Billingsley
[7], Section 8, where $C[0,T]$ is treated, for details). One of the
critical aspects of applying the theory is the establishment of
reasonably readily verifiable criteria for tightness. Since we
usually work with processes and their properties, and not with
measures on $\mathcal{X}$, the criteria should be, if possible, in
terms of available data on the processes.

Suppose that $\{x_n(\cdot)\}$ is a sequence of $R^r$ valued continuous
functions on $[-T,T]$. Then to any subsequence of $\{x_n(\cdot)\}$
there is a further subsequence which converges to an element of
$C^r[-T,T]$, if and only if the sequence is bounded and equicontinuous
(by the Arzela-Ascoli Theorem). Thus, as is well known, the compact
sets of $C^r[-T,T]$ are sets of equibounded and equicontinuous
functions. The criteria for (2.2) all imply that the paths of
$X^n(\cdot)$ are bounded and equicontinuous, with a "high enough"
probability, where the bounds and moduli of continuity are not
dependent on $n$.

The sequence $\{X^n(\cdot)\}$ is tight if and only if, for each
$\eta > 0$, there is an $N_\eta < \infty$ such that

(3.1)    $P\{|X^n(\cdot)| \ge N_\eta\} \le \eta$, all $n$,

and, for each $\epsilon > 0$, $\eta > 0$, there is a
$\delta \in (0,1)$ and an $n_0 < \infty$ such that

(3.2)    $P\{\sup_{|t-s| \le \delta,\ -T \le t,s \le T} |X^n(t) - X^n(s)| \ge \epsilon\} \le \eta$, for $n \ge n_0$.

If, for each $\epsilon > 0$, $\eta > 0$, there is a
$\delta \in (0,1)$ and $n_0$ such that

(3.3)    $P\{\sup_{s \le t \le s+\delta} |X^n(t) - X^n(s)| \ge \epsilon\} \le \eta\delta$, for $n \ge n_0$ and $-T \le s \le s+\delta \le T$,

then (3.2) holds. Equation (3.3) is guaranteed if there is a real
$K$ and an $a > 0$, $b > 0$, such that

(3.4)    $E|X^n(t) - X^n(s)|^a \le K|t-s|^{1+b}$, all $n$.

For the case $C^r(-\infty,\infty)$, we only need to satisfy the
criteria on each $[-T,T]$, where $N_\eta$, $K$, $a$, $b$, $n_0$,
$\delta$ can also depend on $T$.

Of course, in special applications, much work can be devoted to
showing that (3.2) or (3.3) or (3.4) hold. Note that the criterion
(3.4), for a fixed $n$, is simply Kolmogorov's criterion for the
path continuity of a separable random process. Here the criteria -
(3.4) or (3.2) or (3.3) - must hold for all large $n$ with the
constants not depending on $n$. This is hardly surprising.
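As a quick illustration of how (3.4) is used, consider a standard example that is not taken from the paper: suppose every $X^n(\cdot)$ is a standard Wiener process $W(\cdot)$. The increment $W(t) - W(s)$ is Gaussian with mean zero and variance $|t-s|$, so its fourth moment is three times the squared variance:

```latex
E\,|W(t) - W(s)|^{4} \;=\; 3\,|t-s|^{2} \;=\; K\,|t-s|^{1+b},
\qquad a = 4,\quad b = 1,\quad K = 3 .
```

The constants do not depend on $n$, so (3.4) holds for all $n$. Note that the second moment alone, $E|W(t)-W(s)|^2 = |t-s|$, would not do, since (3.4) requires an exponent strictly larger than one on the right-hand side.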

It is often easier to show tightness in a $D$ space than in a
$C$ space, since the conditions are weaker for the former. Also,
working with $D$, it is often possible to show that the limits are
continuous anyway. To avoid more descriptions, we stick to $C$.

4. A General Adaptive Algorithm

Let $\{\xi_n\}$ denote a sequence of $R^{r'}$ valued random variables,
and $\{a_n\}$ a non-negative null sequence of real numbers satisfying

(4.1)    $\sum_n a_n = \infty$.

Let $Q(\cdot,\cdot)$ denote a continuous bounded function:
$R^r \times R^{r'} \to R^r$, and consider the algorithm

(4.2)    $X_{n+1} = X_n - a_n Q(X_n, \xi_n)$, $n \ge 0$, $X_0$ given.

Many adaptive and identification procedures fit into the form
(4.2). Unbounded or even discontinuous $Q(\cdot,\cdot)$ can also be
treated. Specific additional assumptions will be introduced
below. The aim here is solely to discuss the convergence
properties of $\{X_n\}$, and to illustrate how the techniques can
be used in the treatment of applications, but not to deal with
more specific applications of (4.2). (A number of applications
are discussed in [5], [6].) To do this, using the ideas of
Sections 2 and 3, $\{X_n\}$ must be interpolated into a continuous
time function. A natural interpolation is suggested by the
form of (4.2); the "$a_n$" is a natural time interval for the
interpolation.
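A recursion of the form (4.2) is easy to simulate. The sketch below is only an illustration under stated assumptions, none of which come from the paper: scalar iterates, $a_n = 1/(n+1)$ (a null sequence with divergent sum), independent Gaussian noise, and the bounded continuous choice $Q(x,\xi) = \tanh(x - \theta + \xi)$, for which the mean of $Q$ vanishes only at the hypothetical target $\theta$.

```python
import math
import random


def run_sa(x0, theta, n_steps, seed=0):
    """Iterate X_{n+1} = X_n - a_n Q(X_n, xi_n) and return the whole path.

    Assumed for illustration: a_n = 1/(n+1), xi_n i.i.d. N(0, 0.5^2),
    and Q(x, xi) = tanh(x - theta + xi), which is bounded and continuous.
    """
    rng = random.Random(seed)
    x = x0
    path = [x]
    for n in range(n_steps):
        a_n = 1.0 / (n + 1)              # null sequence, sum diverges
        xi_n = rng.gauss(0.0, 0.5)       # noise sample (an assumption)
        x = x - a_n * math.tanh(x - theta + xi_n)
        path.append(x)
    return path


if __name__ == "__main__":
    path = run_sa(x0=3.0, theta=1.0, n_steps=5000)
    print(path[-1])  # expected to be close to theta = 1.0
```

Because $Q$ is bounded by one here, each step moves the iterate by at most $a_n$, which is the property used later to obtain tightness of the interpolated processes.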

† More classical SA procedures, with and without constraints, are
treated in [10] by somewhat similar methods.

Define†

$t_0 = 0$,  $t_n = \sum_{i=0}^{n-1} a_i$,  $m(t) = \max\{n : t_n \le t\}$,

and define a process $X^0(\cdot)$ by $X^0(t) = X_0$, $t \le 0$,
$X^0(t) = X^0(t_n) - Q(X_n, \xi_n)(t - t_n)$ on $[t_n, t_{n+1})$.
Define $X^n(\cdot)$ ($n = 1, 2, \ldots$) by: $X^n(t) = X^0(t_n + t)$
for $t \ge -t_n$, and equal to $X_0 = X^n(-t_n)$, for $t \le -t_n$.
Thus $X^0(\cdot)$ is a piecewise linear interpolation of $\{X_n\}$,
with interpolation intervals $\{a_n\}$, and $X^n(\cdot)$ is a left
shift of $X^0(\cdot)$ by $t_n$. The purpose of the sequence of left
shifts is to (eventually) bring the "asymptotic part" of the
$\{X_n\}$ into some finite time interval. Define the piecewise
constant interpolations: $\bar X(t) = X_n$ on $[t_n, t_{n+1})$,
$\bar\xi(t) = \xi_n$ on $[t_n, t_{n+1})$, with $\bar X(t) = X_0$,
$\bar\xi(t) = \xi_0$, $t \le 0$. Then

(4.3)    $X^0(t) = X_0 - I^0(t)$,
         $X^n(t) = X^n(0) - I^n(t)$,

where $I^n(\cdot)$ is defined by

$I^n(t) = \int_0^t Q(\bar X(t_n + s), \bar\xi(t_n + s))\,ds$,  $t \ge -t_n$,
$I^n(t) = I^n(-t_n)$,  $t \le -t_n$.

† The $m(t)$ and $t_n$ terms will be used frequently, since the theory
requires us to work with the interpolated processes, but the
properties of the sequences must be referred to constantly.
So, we go back and forth between the interpolated process and the
sequences. The $m(t)$ and $t_n$ allow us to keep track of the times
at which the values of one are the values of the other. They also
are responsible for most of the notational difficulties in the paper.

Before proceeding, let us introduce some additional assumptions.

(A1)  $\sup_n P\{|X_n| \ge K\} \to 0$ as $K \to \infty$.

(A2)  There are measurable real valued functions $g(\cdot)$,
$\theta(\cdot)$ such that $\theta(\epsilon) \to 0$ as
$\epsilon \to 0$ and

$|Q(x,y) - Q(x',y)| \le g(y)\,\theta(|x - x'|)$.

Define $G^n(t) = \int_0^t g(\bar\xi(t_n + s))\,ds$,
$t \in (-\infty,\infty)$, and suppose that $\{G^n(\cdot)\}$ is tight
on $C(-\infty,\infty)$.

(A3)  There is a random variable $\xi$ such that, for any bounded
and continuous real valued function $f(\cdot)$:

$E[f(\xi_{n+k}) \mid \xi_i,\ i \le n] \to E f(\xi)$, as $n, k \to \infty$.

(A4)  Define $A(\cdot)$ by $A(x) = E Q(x,\xi)$. (Note that $A(\cdot)$
is bounded and continuous.) Suppose that $S$, the set of zeroes of
$A(\cdot)$, is bounded and connected, and that $\dot x = -A(x)$ is
asymptotically stable (to $S$).†

† Without the connectedness property, the conclusions below
(4.6) are to be replaced by: $X_n \xrightarrow{P}$ largest finite
invariant set of $\dot x = -A(x)$, and this invariant set replaces
$S$ in (4.7).

(A1) can frequently be verified by a Liapunov function type of
method, or $Q(\cdot,\cdot)$ may be constructed to have some type of
intrinsic stability. The aim here is to show the convergence,
assuming that the mass does not wander off to infinity. Both
(A2) and (A3) can be weakened. Indeed, there need not exist
such a $\xi$ for the method to be used, but, then, a weaker assumption
and more detail need to be added. In many problems involving side
constraints, $Q(\cdot,\cdot)$ and $A(\cdot)$ are discontinuous and
indeed, the function that replaces $A(\cdot)$ may even be multivalued.
These matters will be dealt with in a subsequent paper. Also, various
abstract valued $\xi_n$ and $X_n$ can be treated. The various possible
extensions indicate the power of the technique. Here, we try to
keep the structure relatively simple, in order to illustrate the
basic ideas.


Both $X^n(\cdot)$ and $I^n(\cdot)$ have paths in $C^r(-\infty,\infty)$
for each $n$. By (A1) and the boundedness of $Q(\cdot,\cdot)$, both
(3.1) and (3.2) hold for $\{X^n(\cdot), I^n(\cdot)\}$. Hence that
sequence is tight on $C^{2r}(-\infty,\infty)$.

Henceforth, let $N$ index a weakly convergent subsequence (of
the measures induced on $C^{2r}(-\infty,\infty)$ by
$\{X^n(\cdot), I^n(\cdot)\}$). The measure on
$C^{2r}(-\infty,\infty)$, which is the limit of the weakly convergent
subsequence, is induced by a process $X(\cdot), I(\cdot)$ with
continuous paths. Using Skorokhod imbedding, $X^N(t) \to X(t)$,
$I^N(t) \to I(t)$, w.p.1, uniformly on finite $t$-intervals. The
process $I(\cdot)$ is absolutely continuous, and we can suppose that
$|\dot X(t)| \le \sup_{x,y} |Q(x,y)|$. Define $\bar Q(\cdot)$ by
$\dot I(t) = \bar Q(t)$. Thus, $X(t) = X(0) - \int_0^t \bar Q(s)\,ds$.
Of course, (the measures of) all these limits may conceivably depend
on the particular convergent subsequence which is selected. The limit
$X(\cdot)$ will now be analyzed, and it will be shown that (the
convergence results) (4.5), (4.7), (4.8) hold.

Now, suppose that Skorokhod imbedding is used - so that we
can suppose that $\{X^N(\cdot), X(\cdot)\}$ are all defined on the same
probability space.
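The tightness of $\{X^n(\cdot), I^n(\cdot)\}$ claimed above can be checked directly from (4.3). Writing $\bar X$, $\bar\xi$ for the piecewise constant interpolations of the iterates and the noise, and $M$ for the bound on $Q$ (a symbol introduced here only for this sketch), one has for $s, t \ge -t_n$:

```latex
\begin{aligned}
|I^n(t) - I^n(s)|
  &= \Bigl|\int_s^t Q\bigl(\bar X(t_n+u),\,\bar\xi(t_n+u)\bigr)\,du\Bigr|
   \;\le\; M\,|t-s|, \qquad M = \sup_{x,y}|Q(x,y)|, \\
|X^n(t) - X^n(s)|
  &= |I^n(t) - I^n(s)| \;\le\; M\,|t-s| .
\end{aligned}
```

Both paths are constant for arguments below $-t_n$, so the same Lipschitz bound holds on all of $(-\infty,\infty)$. Choosing $\delta < \epsilon/M$ makes the event in (3.2) empty for every $n$. For (3.1), note that $I^n(0) = 0$ and $X^n(0) = X_n$, so $|X^n(t)| \le |X_n| + MT$ on $[-T,T]$, and (A1) supplies the required bound in probability.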

For $t \in (-\infty,\infty)$, define (recall $X^N(s) = X^0(t_N + s)$)

$E^N(t) = \int_0^t [Q(X^N(s), \bar\xi(t_N+s)) - Q(\bar X(t_N+s), \bar\xi(t_N+s))]\,ds$,
$F^N(t) = \int_0^t [Q(X^N(s), \bar\xi(t_N+s)) - Q(X(s), \bar\xi(t_N+s))]\,ds$.

Since $Q(\cdot,\cdot)$ is bounded, $\{E^N(\cdot), F^N(\cdot)\}$ is
tight on $C^{2r}(-\infty,\infty)$. Let $N'$ denote a weakly convergent
subsequence of
$\{E^N(\cdot), F^N(\cdot), X^N(\cdot), I^N(\cdot), G^N(\cdot)\}$, and
suppose (henceforth) that Skorokhod imbedding is used. Note that,
by† (A2),

$|E^{N'}(t)| \le \max_{|s| \le T} \theta(|X^{N'}(s) - \bar X(t_{N'}+s)|)\, G^{N'}(t)$,  $|t| \le T$.

Since $G^{N'}(\cdot)$ converges to some limit
$G(\cdot) \in C(-\infty,\infty)$, and the max term goes to zero as
$N' \to \infty$, the limit of $\{E^{N'}(\cdot)\}$ is the zero process.
Similarly for $\{F^{N'}(\cdot)\}$. These limits do not depend
on $\{N'\}$, and the actual limit $G(\cdot)$ is not important (only its
existence is important), and so we suppose that $N' = N$, and return
to the original subsequence. (The above argument will be
implicitly used several times in the sequel - usually when (A2)
is appealed to.) Thus, the limits of $\{I^N(\cdot)\}$ are the same as
those of $\{\tilde I^N(\cdot)\}$, where we define

$\tilde I^N(t) = \int_0^t Q(X(s), \bar\xi(t_N+s))\,ds$,  $t \ge -t_N$,

and

$\tilde I^N(t) = \tilde I^N(-t_N)$,  $t \le -t_N$.

Clearly, $\tilde I^N(\cdot)$ is easier to work with than is
$I^N(\cdot)$. We will now try to simplify $\tilde I^N(\cdot)$.

† The only difference between $X^n(s)$ and $\bar X(t_n+s)$ is that the
first is a piecewise linear and the second a piecewise constant
interpolation. They are equal at the $s = t_{n+i} - t_n$, all $i$.
Define the function $X_\Delta(\cdot)$ by $X_\Delta(s) = X(i\Delta)$ on
$[i\Delta, i\Delta+\Delta)$, $i = 0, \pm 1, \ldots$. Define
$\tilde I^N_\Delta(\cdot)$ as $\tilde I^N(\cdot)$ was defined, but with
$X_\Delta(\cdot)$ replacing $X(\cdot)$. Suppose that
$\tilde I^N_\Delta(t) \to \int_0^t A(X_\Delta(s))\,ds$ weakly
as $N \to \infty$. Then by (A2), and the convergence of
$X_\Delta(\cdot)$ to $X(\cdot)$ as $\Delta \to 0$ (uniformly† w.p.1 on
finite time intervals), the limit of $\{\tilde I^N(\cdot)\}$ has values
$\int_0^t A(X(s))\,ds$. In particular, we can consider one interval at
a time: we need only consider the limit of
$\{\int_{i\Delta}^t Q(x, \bar\xi(t_N+s))\,ds,\ t \in (i\Delta, i\Delta+\Delta)\}$
for each $x \in R^r$.

† Assuming Skorokhod imbedding is used.


Define $\tilde Q(x, \bar\xi(t_N+s)) = Q(x, \bar\xi(t_N+s)) - E Q(x,\xi)$.
The sequence $\{K^N(\cdot)\}$ with values ($t \ge i\Delta$)

$K^N(t) = \int_{i\Delta}^t \tilde Q(x, \bar\xi(t_N+s))\,ds$

is tight on $C^r[i\Delta, i\Delta+\Delta]$, since $Q$ is bounded. To
show $K^N(\cdot) \to$ zero process weakly, as $N \to \infty$, it is
only necessary to show that $E|K^N(t)| \to 0$ as $N \to \infty$. To do
this with simple notation, do it component by component. In particular
(w.l.o.g.), suppose that $Q(\cdot,\cdot)$ and $\tilde Q(\cdot,\cdot)$
are scalar valued. Note that $E|K^N(t)|^2$ has the same limit as has

(4.4)    $\sum_{k,j=m(t_N+i\Delta)}^{m(t_N+t)} a_k a_j\, E\,\tilde Q(x,\xi_k)\,\tilde Q(x,\xi_j)$
         $= \sum_{k,j=0}^{m(t_N+t)-m(t_N+i\Delta)} a_{m(t_N+i\Delta)+k}\, a_{m(t_N+i\Delta)+j}\, E\,\tilde Q(x,\xi_{m(t_N+i\Delta)+k})\,\tilde Q(x,\xi_{m(t_N+i\Delta)+j})$.

For a $\delta > 0$, $\delta < t - i\Delta$, first consider the sum in
(4.4) over those $k, j$ such that
$|t_{m(t_N+i\Delta)+k} - t_{m(t_N+i\Delta)+j}| \le \delta$. This sum is
bounded from above by some constant times $\delta$. Thus, to show that
(4.4) $\to 0$ as $N \to \infty$, we can suppose that

$|t_{m(t_N+i\Delta)+k} - t_{m(t_N+i\Delta)+j}| \ge \delta$,

for an arbitrary $\delta > 0$. In particular, we can assume that
$t_{m(t_N+i\Delta)+k} \ge t_{m(t_N+i\Delta)+j} + \delta$. As
$N \to \infty$, the $k, j$ satisfying this relationship also satisfy
$k - j \to \infty$, since $a_n \to 0$ as $n \to \infty$.
Using these facts, together with (A3) and the definition of
$\tilde Q(\cdot,\cdot)$, yields that, for such $k, j$ pairs,

$E[\tilde Q(x, \xi_{m(t_N+i\Delta)+k}) \mid \xi_u,\ u \le m(t_N+i\Delta)+j] \to 0$ w.p.1 as $N \to \infty$.

This result, together with

$\lim_{N\to\infty} \sum_{j=m(t_N+i\Delta)}^{m(t_N+t)} a_j = (t - i\Delta)$

and the arbitrariness of $\delta$, implies that (4.4) $\to 0$ as
$N \to \infty$.

Thus, $\tilde I^N_\Delta(t) \to \int_0^t A(X_\Delta(s))\,ds$, as
$N \to \infty$.

Finally, combining the above results, we have that
$I^N(t) \to \int_0^t A(X(s))\,ds$, w.p.1, on finite $t$-intervals,
and, hence,

(4.5)    $\dot X(t) = -A(X(t))$,  $t \in (-\infty,\infty)$,

where $X(0)$ may possibly depend on the particular convergent
subsequence.
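The limit equation (4.5) is an ordinary differential equation and can be explored numerically. The sketch below is a toy illustration, not the paper's setting: it assumes a scalar state, the choice $Q(x,\xi) = \tanh(x - \theta + \xi)$ with $\xi = \pm 1$ with probability 1/2 each, so that $A(x) = E\,Q(x,\xi)$ has a closed form with the single zero $\theta$, and it integrates $\dot x = -A(x)$ with forward Euler steps. `THETA`, `mean_field` and `solve_mean_ode` are hypothetical names.

```python
import math

THETA = 1.0  # hypothetical target; in this toy model S = {THETA}


def mean_field(x):
    """A(x) = E Q(x, xi) for Q = tanh(x - THETA + xi), xi = +1 or -1 w.p. 1/2."""
    return 0.5 * (math.tanh(x - THETA + 1.0) + math.tanh(x - THETA - 1.0))


def solve_mean_ode(x_start, horizon, step=0.01):
    """Forward Euler for dx/dt = -A(x) on [0, horizon]; returns the final state."""
    x = x_start
    for _ in range(int(round(horizon / step))):
        x -= step * mean_field(x)
    return x


if __name__ == "__main__":
    for x_start in (3.0, -2.0):
        print(x_start, solve_mean_ode(x_start, horizon=40.0))
```

In this example $A(\cdot)$ is strictly increasing through zero at $\theta$, so every trajectory of the equation is attracted to $S = \{\theta\}$, which is the kind of stability that (A4) asks for.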

By (A1) and the weak convergence,

(4.6)    $\sup_{|t| < \infty} P\{|X(t)| \ge K\} \to 0$ as $K \to \infty$.

By (A4) and (4.6), the paths of $X(\cdot)$ are bounded w.p.1, whatever
the convergent subsequence. Under (A4), the bounded trajectories
on $(-\infty,\infty)$ must lie in $S$. Thus, $X(t) \in S$,
$t \in (-\infty,\infty)$. Since $S$ does not depend on the selected
subsequence, and since any subsequence of
$\{X^n(\cdot), I^n(\cdot)\}$ has a weakly convergent subsequence,
$X_n \to S$ in probability, as $n \to \infty$. To strengthen the
result, fix $T$ and note that the functions
$f(\cdot): C^r(-\infty,\infty) \to R$ defined by

$f(x(\cdot)) = \sup_{-T \le t \le T} |x(t)|$,  or by  $\sup_{|t| \le T} \mathrm{dist.}(x(t), S)$,

are continuous on $C^r(-\infty,\infty)$. Thus, by the weak convergence,

(4.7)    $P\{\sup_{|t| \le T} \mathrm{dist.}(X^n(t), S) \ge \epsilon\} \to 0$

as $n \to \infty$, each $T > 0$, $\epsilon > 0$. It is also true that
(for any convergent subsequence $N$ and limit $X(\cdot)$)

(4.8)    $P\{\sup_{|t| \le T} |X^N(t) - X(t)| \ge \epsilon\} \to 0$

as $N \to \infty$, each $\epsilon > 0$.

Unbounded Q. The basic problem concerns tightness of
$\{I^N(\cdot)\}$, and some condition which guarantees this needs to be
added. A special case is $Q(x,y) = Q_0(x,y) + Q_1(y)$, where $Q_0$ is
bounded, and the processes whose values are the "natural" integrals
$\int_0^t Q_1(\bar\xi(t_n+s))\,ds$ are tight. Another special case is
where $|Q(x,y)| \le \mathrm{const.}(1 + |y|)$, and (the constant is
independent of $n$)

$\int_0^t \int_0^t E\, Q(\bar X(t_n+s), \bar\xi(t_n+s))\, Q(\bar X(t_n+\tau), \bar\xi(t_n+\tau))\,ds\,d\tau \le \mathrm{const.}\,(t^2)$.

But the matter will not be pursued. The aim here is to illustrate
the technique - not to develop the very best conditions.

Remark. The basic convergence result was obtained with relatively
little pain, and the proof followed a rather natural set of ideas. The
asymptotic problem was reduced to a natural one concerning the
properties of an ordinary differential equation (4.5), as in [5], [6],
[10]. Many variations are possible. With a suitable alteration in
the algorithm, even the case where there is an occasional system
disturbance of an impulsive type can be treated, as can various cases
where the $\{a_n\}$ are random variables.


5. Inputs $\{\xi_n\}$ Depending on the Iterates $\{X_n\}$


In some applications, it may be desired to let $\xi_n$ depend
on $X_n$, or even on $X_n, X_{n-1}$; for example, in an identification
and control problem, where the current input may be allowed to
depend on the current parameter estimate. This problem was
treated in [5], where a specific application is given, and we
re-do the general type of result given there, using the weak
convergence technique, and somewhat weaker assumptions on the noise.

Let $\{\psi_n\}$ denote a given sequence of random variables, and
$h(\cdot,\cdot,\cdot)$ a measurable function such that
$\xi_{n+1} = h(\xi_n, \psi_n, X_{n+1})$. We take this form in order to
be specific - but more general forms can be used. Additional
conditions will be introduced below. Unless otherwise specified, we
retain the conditions and terminology of the previous section. Again,
our interest is in the techniques and methods of utilizing the weak
convergence ideas for the type of general application with which we
deal, but not in the more specific applications, of which there are
many.
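The feedback structure $\xi_{n+1} = h(\xi_n, \psi_n, X_{n+1})$ can be sketched by extending a simulation of (4.2) with one extra line. Everything specific below is an assumption made for illustration and does not come from the paper: scalar variables, $a_n = 1/(n+1)$, Gaussian $\psi_n$, $Q(x,\xi) = \tanh(x - \xi)$, and $h(\xi,\psi,x) = 0.5\,\xi + \psi + 0.1\tanh(x)$. For a frozen $x$, the input process is then a stable autoregression whose stationary law is symmetric about $0.2\tanh(x)$, so the averaged dynamics vanish only at $x = 0$.

```python
import math
import random


def h(xi, psi, x):
    """Hypothetical input map: stable in xi, weakly dependent on the iterate x."""
    return 0.5 * xi + psi + 0.1 * math.tanh(x)


def run_feedback_sa(x0, n_steps, seed=0):
    """Run X_{n+1} = X_n - a_n Q(X_n, xi_n) with xi_{n+1} = h(xi_n, psi_n, X_{n+1})."""
    rng = random.Random(seed)
    x, xi = x0, 0.0
    for n in range(n_steps):
        a_n = 1.0 / (n + 1)
        x = x - a_n * math.tanh(x - xi)   # Q(x, xi) = tanh(x - xi), bounded
        psi = rng.gauss(0.0, 0.3)         # exogenous noise psi_n (an assumption)
        xi = h(xi, psi, x)                # input depends on the new iterate
    return x


if __name__ == "__main__":
    print(run_feedback_sa(x0=2.0, n_steps=20000))  # expected to be near 0
```

The contraction factor 0.5 in `h` plays the role of the inherent stability of the input process for each fixed value of the iterate, so that small changes in the third argument have a small, non-accumulating effect on the inputs.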

Under (A1) and the bound on $Q(\cdot,\cdot)$,
$\{X^n(\cdot), I^n(\cdot)\}$ is still tight on
$C^{2r}(-\infty,\infty)$. Fix a weakly convergent subsequence - also
to be indexed by $N$, and with limit $X(\cdot), I(\cdot)$. Note that
(4.8) still holds. To get a convergence result, the net or average
effect of $X_n$ on $\xi_m$ must vanish as $m - n \to \infty$ and some
condition guaranteeing this must be introduced. To this end and for
reference below let $q_n \to \infty$ as $n \to \infty$ and define, for
each $x$ and $n = 0, \ldots$, the sequences

(5.1)    $\xi^n_{k+1}(x) = h(\xi^n_k(x), \psi_{q_n+k}, x)$,
         $\tilde\xi^n_{k+1}(x) = h(\tilde\xi^n_k(x), \psi_{q_n+k}, x) + \epsilon^n_k$,  $k = 0, 1, \ldots$,

where, for each $x$ and $n$, $\tilde\xi^n_0(x) = \xi^n_0(x)$, and
$\{\epsilon^n_k,\ k = 0, 1, \ldots\}$ is a sequence of "errors".
Assume

(A5)  $\{\xi_n\}$ is tight, or, equivalently, bounded in probability,
uniformly in $n$ (on the Euclidean range space); i.e.,

$\sup_n P\{|\xi_n| \ge K\} \to 0$ as $K \to \infty$.

(A6)  (replaces (A3)) If, for some fixed $x$, $\{\xi^n_0(x)\}$ is
tight†, then there is a random variable $\xi^x$ such that, for each
bounded and continuous function $f(\cdot)$,

$E[f(\xi^n_{i+k}(x)) \mid \xi^n_j(x),\ j \le i] \to E f(\xi^x)$,

uniformly in $n$, as $i, k \to \infty$. The function of $(x,y)$ with
values $E Q(y, \xi^x)$ is continuous. Define $A(x) = E Q(x, \xi^x)$,
and let $A(\cdot)$ have the properties of the $A(\cdot)$ in (A4).

(A7)  For each $\delta > 0$, $t > 0$, suppose that there is an
$\epsilon > 0$ such that
$\{|\epsilon^n_k| \le \epsilon$ for $0 \le k \le m(t_n+t) - n\}$ and
tightness† of the initial sequence
$\{\tilde\xi^n_0(x)\} = \{\xi^n_0(x)\}$ imply that

$\lim_{n\to\infty}\ \sup_{k \le m(t_n+t)-n} P\{|\tilde\xi^n_k(x) - \xi^n_k(x)| \ge \delta\} \le \delta$.

† Equivalently, boundedness in probability, uniformly in $n$.


Assumptions (A5) and (A6) do not seem to be restrictive, and
the condition requiring existence of $\xi^x$ can also be weakened.
Condition (A7) says, basically, that the $\{\xi_n\}$ process has an
inherent stability for each fixed $X_n = x$, in the sense that
small perturbations to $h(\cdot,\cdot,\cdot)$ do not seriously affect the
value of $\{\xi_n\}$.
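The stability property in (A7) can be illustrated numerically. The following is only a minimal sketch under stated assumptions: the map `h` below is a hypothetical contraction chosen for illustration (it is not the paper's `h`), the driving noise is Gaussian, and the errors are drawn uniformly from a small interval. Both recursions share the same driving noise and initial condition, and the function returns the largest gap between the perturbed and unperturbed sequences.

```python
import numpy as np

def h(xi, psi, x):
    # Hypothetical stable dynamics (an assumption, not from the paper):
    # a contraction in xi, driven by the noise psi and the frozen parameter x.
    return 0.5 * xi + psi + 0.1 * x

def max_gap(eps_bound, steps=200, x=1.0, seed=0):
    """Largest |xi_tilde_k - xi_k| when the errors satisfy |eps_k| < eps_bound."""
    rng = np.random.default_rng(seed)
    psi = rng.normal(size=steps)                          # shared driving noise
    eps = rng.uniform(-eps_bound, eps_bound, size=steps)  # bounded "errors"
    xi = xi_tilde = 0.0                                   # common initial condition
    gap = 0.0
    for k in range(steps):
        xi = h(xi, psi[k], x)                        # unperturbed recursion
        xi_tilde = h(xi_tilde, psi[k], x) + eps[k]   # perturbed recursion
        gap = max(gap, abs(xi_tilde - xi))
    return gap

if __name__ == "__main__":
    for bound in (0.1, 0.01, 0.001):
        print(bound, max_gap(bound))
```

For this particular `h` the gap obeys d_{k+1} = 0.5 d_k + eps_k, so it stays below twice the error bound; a less stable `h` would not give such a uniform bound, which is the situation (A7) is meant to exclude.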

(A8) $h(\cdot,\cdot,\cdot)$ is continuous in its third argument, uniformly
in the first two; i.e., for some $\theta(\cdot)$ with $\theta(u) \to 0$ as
$u \to 0$,

$$|h(\xi,\psi,x') - h(\xi,\psi,x)| \le \theta(|x-x'|), \quad \text{uniformly in } \xi, \psi.$$

Condition (A8) can be replaced by more general alternatives. For
example, if (A7) is altered such that $\{|\varepsilon^n_k| < \varepsilon, \ldots\}$
is replaced by

$$\Big\{ \sup_{i \le m(t_n+t)-n} \Big| \sum_{k=0}^{i} \varepsilon^n_k \Big| < \varepsilon \Big\},$$

then (A8) can be replaced by

(A8') Let there exist $\theta(\cdot)$, $g_1(\cdot)$ such that
$\theta(u) \to 0$ as $u \to 0$ and

$$|h(\xi,\psi,x') - h(\xi,\psi,x)| \le \theta(|x-x'|)\, g_1(\xi,\psi),$$

and let $\{G_1^n(t)\}$ be tight ($G_1^n(\cdot)$ is defined analogously to
$G^n(\cdot)$ in (A2)).

The aim of this Section is to show the same end result as was
shown in Section 4, namely that $X(\cdot)$ satisfies
$\dot X(t) = -A(X(t))$ ($A(\cdot)$ defined in (A6)), from which we get
(as in Section 4) that $X^n(t) \xrightarrow{P} S$, and (4.7). As in
Section 4, it suffices to show that

(5.2)
$$\int_{i\Delta}^{i\Delta+t} Q(X(i\Delta), \xi(t_N+s))\,ds \to \int_{i\Delta}^{i\Delta+t} EQ(x,\xi^x)\,ds\,\Big|_{x=X(i\Delta)}$$

weakly on $C[i\Delta, i\Delta+\Delta]$ as $N \to \infty$, for each
$\Delta > 0$ and $i$. We always suppose that Skorokhod imbedding is used.
Now, define the initial condition and error terms in (5.1) by (we select
them to enable us to deal with the processes on the interval
$[i\Delta, i\Delta+\Delta]$)

$$\tilde\xi^N_0(x) = \xi_{m(t_N+i\Delta)} = \xi^N_0(x), \quad \text{all } x,$$
$$q_N = m(t_N+i\Delta),$$
$$\varepsilon^N_k = h(\xi_{m(t_N+i\Delta)+k}, \psi_{m(t_N+i\Delta)+k}, X_{m(t_N+i\Delta)+k+1}) - h(\xi_{m(t_N+i\Delta)+k}, \psi_{m(t_N+i\Delta)+k}, X(i\Delta)).$$


Thus

(5.3a)
$$\xi_{m(t_N+i\Delta)+k} = \tilde\xi^N_k(X(i\Delta)),$$

(5.3b)
$$\xi(t_N+i\Delta+u) = \tilde\xi^N_{m(t_N+i\Delta+u)-m(t_N+i\Delta)}(X(i\Delta)), \quad \text{for } u \ge 0.$$
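The identity (5.3a) can be checked by induction on $k$. The sketch below assumes that the actual noise sequence evolves as $\xi_{n+1} = h(\xi_n, \psi_n, X_{n+1})$ (a form inferred here from the definition of the error terms, not restated in this Section), and uses the shorthand $q = m(t_N+i\Delta)$, $x = X(i\Delta)$, with $\tilde\xi^N_k(x)$ denoting the perturbed sequence of (5.1):

```latex
\begin{aligned}
\tilde\xi^N_0(x) &= \xi_q, \\
\tilde\xi^N_{k+1}(x)
  &= h(\tilde\xi^N_k(x), \psi_{q+k}, x) + \varepsilon^N_k \\
  &= h(\xi_{q+k}, \psi_{q+k}, x)
     + \big[\, h(\xi_{q+k}, \psi_{q+k}, X_{q+k+1}) - h(\xi_{q+k}, \psi_{q+k}, x) \,\big]
     && \text{(induction hypothesis } \tilde\xi^N_k(x) = \xi_{q+k}\text{)} \\
  &= h(\xi_{q+k}, \psi_{q+k}, X_{q+k+1}) = \xi_{q+k+1}.
\end{aligned}
```

In other words, the error terms are chosen exactly so that the perturbed recursion with frozen parameter $x = X(i\Delta)$ reproduces the true sequence on the interval of interest.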


By (A5), the tightness requirement on
$\{\tilde\xi^N_0(X(i\Delta))\} = \{\xi_{m(t_N+i\Delta)}\}$
in (A7) (for $x = X(i\Delta)$) is satisfied. Next, note that (5.4)
follows from (A8) and (4.8).

(5.4)
$$\lim_{\Delta\to 0}\ \lim_{N\to\infty} P\Big\{ \sup_{0 \le k \le m(t_N+i\Delta+\Delta)-m(t_N+i\Delta)} |\varepsilon^N_k| > \varepsilon \Big\} = 0, \quad \text{each } \varepsilon > 0.$$

By (5.3), (5.4) and (A7), there is a $\delta_\Delta$ which goes to 0 as
$\Delta \to 0$ and such that

(5.5)
$$\lim_{N\to\infty} E \int_{i\Delta}^{i\Delta+\Delta} \big| Q(X(i\Delta), \xi(t_N+s)) - Q(X(i\Delta), \xi^N(X(i\Delta),s)) \big|\,ds \le \delta_\Delta,$$

where we define $\xi^N(x,s)$ to be the piecewise constant right
continuous interpolation of $\{\xi^N_k(x)\}$, with interpolation intervals
$\{a_{m(t_N+i\Delta)}, a_{m(t_N+i\Delta)+1}, \ldots\}$.

By (5.5), in order to prove (5.2), we need only show that the
sequence of functions $\hat Q^N(\cdot)$ on the interval
$[i\Delta, i\Delta+\Delta]$ and with values

(5.6)
$$\hat Q^N(t) = \int_{i\Delta}^{i\Delta+t} \tilde Q(x, \xi^N(x,s))\,ds,$$

where

$$\tilde Q(x,\xi^N(x,s)) = Q(x,\xi^N(x,s)) - EQ(x,\xi^x),$$

tends weakly to the zero process as $N \to \infty$, for each $x$. As done
in Section 4, we can suppose that $Q(\cdot,\cdot)$ is scalar valued. Then
the square of (5.6) has the value

(5.7)
$$\sum_{j,k=0}^{m(t_N+i\Delta+t)-m(t_N+i\Delta)} a_{m(t_N+i\Delta)+k}\, a_{m(t_N+i\Delta)+j}\, \tilde Q(x,\xi^N_k(x))\, \tilde Q(x,\xi^N_j(x)).$$

Now, by (A6), and an argument like that associated with the
convergence of (4.4), and with $f(\cdot) = Q(x,\cdot)$, we have
$E|\hat Q^N(t)|^2 \to 0$, which implies that $\hat Q^N(\cdot) \to$ zero
process weakly.

The  demonstration  of  convergence  is  now  complete,  and  (4.5)  and 
(4.7)  continue  to  hold. 
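The averaging step behind (5.7) can be illustrated with a small Monte Carlo experiment. This is a sketch under assumptions that are not in the paper: the noise is a stationary scalar AR(1) sequence, the centred function is simply the identity (so its stationary mean is zero), and the gains are a_k = 1/k. The function estimates the mean square of the weighted sum of the centred terms over a window of fixed interpolated-time length `delta`, starting at index `N`.

```python
import numpy as np

def mean_square_window_sum(N, delta=1.0, reps=200, rho=0.5, seed=0):
    """Monte Carlo estimate of E|sum_k a_k * xi_k|^2 over one time window.

    Assumptions (hypothetical, for illustration only): a_k = 1/k,
    xi is a stationary AR(1) sequence with coefficient rho, and the
    centred function of the noise is the identity.
    """
    rng = np.random.default_rng(seed)
    xi = rng.normal(size=reps) / np.sqrt(1.0 - rho**2)  # stationary start
    total = np.zeros(reps)
    elapsed, k = 0.0, N
    while elapsed < delta:               # window of interpolated length delta
        a_k = 1.0 / k
        total += a_k * xi                # weighted, already-centred term
        elapsed += a_k
        xi = rho * xi + rng.normal(size=reps)  # advance the AR(1) noise
        k += 1
    return float(np.mean(total**2))

if __name__ == "__main__":
    for N in (50, 500, 5000):
        print(N, mean_square_window_sum(N))
```

As the starting index grows, the window contains more terms with smaller gains, and the mean square shrinks, which is the qualitative behavior the argument relies on.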

The  proof  was  relatively  straightforward,  and  the  assumptions 
not  unreasonable.  The  tightness  assumptions  and  weak  convergence 
techniques  allow  a relatively  simple  treatment  - one  which  focuses 
on  the  basic  structures  and  does  not  get  overinvolved  in  detail. 
Generalizations  to  abstract  valued  problems  are  also  possible. 


6.  An  Identification  Problem 


Again, following an application in [5], we discuss an algorithm for the
identification of the coefficients
$\theta = (A_1, \ldots, A_\ell, B_1, \ldots, B_\ell)$ in the system

(6.1)
$$y_n + A_1 y_{n-1} + \cdots + A_\ell y_{n-\ell} = B_1 u_{n-1} + \cdots + B_\ell u_{n-\ell} + \rho_n,$$

where $\{\rho_n\}$ is some sequence of random variables. For notational
simplicity, let the $A_i, B_i$ be scalars; the general case is treated
in exactly the same way. Define
$\psi_n = (-y_{n-1}, \ldots, -y_{n-\ell}, u_{n-1}, \ldots, u_{n-\ell})'$.
Then $y_n = \theta'\psi_n + \rho_n$. Suppose that the l.h.s. of (6.1) is
asymptotically stable, and $\{u_n, \rho_n\}$ satisfies the condition in
(A3) on $\{\xi_n\}$. Let $\psi_n$ be known at time $n$.

Let $Y_n$ denote the $n$th estimate of $\theta$. A common recursive
estimation algorithm is given by (6.1), for suitable functions
$Q_1(\cdot)$, $Q_2(\cdot)$:


(6.1)
$$R_n = R_{n-1} + a_n Q_1(\psi_n\psi_n' - R_{n-1}),$$
$$K_n = R_n^{-1}\psi_n \big/ [1 + a_n \psi_n' R_n^{-1} \psi_n],$$
$$Y_n = Y_{n-1} + a_n Q_2(K_n\,\ldots)$$

Usually the $Q_i(\cdot)$ are the identity functions. Let $Q_1(\cdot)$ be
the identity, and suppose that


VYn 


(6.2) 


is bounded in probability, uniformly in n,
and that there is a positive definite matrix H such that
(6.3)  1 i nl  ^ H»  as  n*k  •*  “• 

Define $B^0(\cdot)$: $B^0(t) = 0$, $t \le 0$, and
$B^0(t) = B^0(t_n) + (t-t_n)\psi_n\psi_n'$ on $[t_n, t_{n+1}]$, and define
$B^n(\cdot)$ by $B^n(t) = B^0(t+t_n)$. Also, define $R^n(t)$ and $Y^n(t)$
as $X^n(\cdot)$ was defined in Section 4, but with the r.h.s. of the first
and third lines of (6.1) replacing the r.h.s. of (4.2). Only a rough
outline will be given. Under reasonable conditions, the sequence
$\{B^n(\cdot), R^n(\cdot)\}$ is tight on† $C^{2s}(-\infty,\infty)$.

Then, by the result of Section 4, $\{R_n\}$ tends in probability to the
constant limit solution $R$ of

$$\dot R = H - R,$$

and (4.7) holds with $X^N(t)$, $S$ replaced by $R^N(t)$, $R$.
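The limit equation for R is linear, with solution R(t) = H + exp(-t)(R(0) - H), so its only constant solution is R = H. The following minimal sketch illustrates the first line of the recursion with the identity function in place of Q_1. The choices a_n = 1/n, Gaussian regressors, and the particular matrix `H` are assumptions made only for this illustration. With a_n = 1/n the recursion reduces exactly to the running average of the outer products.

```python
import numpy as np

def run_R(psis, R0):
    """Iterate R_n = R_{n-1} + a_n (psi_n psi_n' - R_{n-1}) with a_n = 1/n."""
    R = R0.copy()
    for n, psi in enumerate(psis, start=1):
        a_n = 1.0 / n                       # assumed gain sequence
        R = R + a_n * (np.outer(psi, psi) - R)
    return R

def demo(n=20000, seed=0):
    rng = np.random.default_rng(seed)
    H = np.array([[2.0, 0.5], [0.5, 1.0]])  # hypothetical second-moment matrix
    psis = rng.multivariate_normal(np.zeros(2), H, size=n)
    return run_R(psis, np.eye(2)), H, psis

if __name__ == "__main__":
    R, H, _ = demo()
    print(R)
    print(H)
```

Because a_1 = 1, the initial value `R0` is forgotten after the first step; other gain sequences would retain a decaying dependence on it.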

The convergence of $\{Y_n\}$ can also be treated with $Q_2$ = identity.
But in order to be able to appeal directly to the result of
Section 4, without further work, let $Q_2(\cdot)$ be bounded and
continuous: for example, each component of $Q_2(\cdot)$ can be a
saturation function. (Indeed, it is possible to study the limit
as a function of the saturation level.) Some additional conditions
need to be introduced to assure that (A2) and (A3) hold (here $\xi_n$
is replaced by $(\psi_n, \rho_n)$). These conditions are not unreasonable,
but to save some space and discussion, simply assume (A2), (A3). Define
$A_2(\cdot,\cdot)$ (a function of a matrix $M$ and vector $Y$) by

a2CM,y)  = 11m  E Q2(«  Vk'Vk-1' Vk1 1 VV  1 i ">• 

†$s$ = number of elements in $R$.

Let $A_2(\cdot,\cdot)$ and $\dot Y = A_2(R^{-1},Y)$ have the properties of
the $A(\cdot)$ and $\dot x = -A(x)$ of (A4). Then $\{Y_n\}$ converges in
probability to the limit set of $\dot Y = A_2(R^{-1},Y)$, and (4.7) holds
also.
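A simulation can make the identification scheme concrete. The sketch below is not the paper's algorithm verbatim: it assumes a first-order scalar system, Gaussian input and noise, gains a_n = 1/(n+1), the identity for Q_1, a componentwise saturation for Q_2, and the simplified update Y_n = Y_{n-1} + a_n Q_2(R_n^{-1} psi_n (y_n - Y_{n-1}' psi_n)). All names and numerical values are hypothetical.

```python
import numpy as np

def identify(n_steps=5000, A=-0.5, B=1.0, noise=0.1, sat=5.0, seed=0):
    """Recursive identification of theta = (A, B) in
    y_n + A*y_{n-1} = B*u_{n-1} + rho_n, with a saturated correction term."""
    rng = np.random.default_rng(seed)
    theta = np.array([A, B])
    R = np.eye(2)            # initial R_0 (keeps R_n invertible)
    Y = np.zeros(2)          # initial estimate Y_0
    y_prev, u_prev = 0.0, 0.0
    for n in range(1, n_steps + 1):
        psi = np.array([-y_prev, u_prev])          # regressor psi_n
        y = theta @ psi + noise * rng.normal()     # y_n = theta' psi_n + rho_n
        a_n = 1.0 / (n + 1)                        # assumed gain sequence
        R = R + a_n * (np.outer(psi, psi) - R)     # Q_1 = identity
        K = np.linalg.solve(R, psi)                # R_n^{-1} psi_n
        Y = Y + a_n * np.clip(K * (y - Y @ psi), -sat, sat)  # Q_2 = saturation
        y_prev, u_prev = y, rng.normal()           # new input u_n
    return Y, theta

if __name__ == "__main__":
    Y, theta = identify()
    print("estimate:", Y, "true:", theta)
```

With this gain sequence and a saturation level that is rarely active, the update coincides with regularized recursive least squares, so the estimate should settle near the true coefficients. Lowering `sat` changes the limit equation, which is the dependence on the saturation level mentioned in the text.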


7.  Conclusions 


Some  of  the  concepts  of  weak  convergence  theory  have  been 
introduced  and  applied  to  convergence  problems  for  a family  of 
recursive  adaptive  procedures.  The  conditions  and  ideas  are 
rather  natural  for  that  type  of  problem,  and  the  proofs  are 
relatively simple. There are possible extensions in many directions.
It  is  expected  that  the  techniques  will  play  an  important  role  in 
control  theory. 




References 


[1]  D.L. Iglehart, "Diffusion approximations in applied probability",
Mathematics of the Decision Sciences, Part II, Lectures in
Applied Math., 12, 1968, American Math. Soc., Providence, R.I.

[2]  D.L. Iglehart, "Weak convergence in queueing theory", J. Appl.
Prob., 5, pp. 570-594, 1973.

[3]  H.J. Kushner, "A survey of some applications of probability and
stochastic control theory to finite difference methods for
degenerate elliptic and parabolic equations", SIAM Rev., 18,
no. 4, pp. 545-577, 1976.

[4]  H.J. Kushner, Probability Methods for Approximations in
Stochastic Control and for Elliptic Equations, Academic Press,
to appear February, 1977.

[5]  L. Ljung, "Analysis of recursive stochastic algorithms",
Report 7616 (C), March 1976, Dept. of Automatic Control, Lund
Inst. of Technology. Submitted to IEEE Trans. on Automatic
Control.

[6]  L. Ljung, T. Söderström, I. Gustavsson, "Counterexamples to
general convergence of a commonly used recursive identification
method", IEEE Trans. on Automatic Control, AC-20, no. 5,
October 1975, pp. 643-652.

[7]  P. Billingsley, Convergence of Probability Measures, John Wiley,
New York, 1968.

[8]  I.I. Gikhman, A.V. Skorokhod, Introduction to the Theory of
Random Processes, Saunders, Phila., 1969.

[9]  A.V. Skorokhod, "Limit theorems for stochastic processes",
Theory of Probability and its Applications, 1, 1956,
pp. 262-290 (English translation).

[10]  H.J. Kushner, "General convergence results for constrained
and unconstrained stochastic approximations", 1976 Decision
and Control meeting, St. Petersburg, Florida.


REPORT DOCUMENTATION PAGE

Title: CONVERGENCE OF RECURSIVE ADAPTIVE AND IDENTIFICATION
PROCEDURES VIA WEAK CONVERGENCE THEORY

Type of report: Interim

Author: Harold J. Kushner

Performing organization: Brown University, Lefschetz Ctr for Dynamical
Systems, Div of Applied Mathematics, Providence, RI 02912

Controlling office: Air Force Office of Scientific Research/NM,
Bolling AFB, Washington, DC 20332

Security class. (of this report): UNCLASSIFIED

Distribution statement (of this report): Approved for public release;
distribution unlimited.

